Adds information about cooldown periods for trained model autoscaling in Serverless #2498

kosabogi · 2025-08-11T12:15:53Z

This PR adds information about cooldown periods for trained model autoscaling in serverless projects.

Changes

Related issue: https://github.com/elastic/docs-content-internal/issues/177

github-actions · 2025-08-11T12:17:57Z

🔍 Preview links for changed docs

kilfoyle

LGTM! 🦖
Very nice!

deploy-manage/cloud-organization/billing/elasticsearch-billing-dimensions.md

ppf2 · 2025-08-14T15:06:02Z

@prwhelan Can you review for technical accuracy? Thx!

prwhelan · 2025-08-14T21:54:35Z

deploy-manage/cloud-organization/billing/elasticsearch-billing-dimensions.md

+    * When using the inference API for {{es}} or ELSER, [enable `adaptive_allocations`](../../autoscaling/trained-model-autoscaling.md#enabling-autoscaling-through-apis-adaptive-allocations).
+
+    ::::{note}
+    In {{serverless-short}}, trained model deployments scale down to zero only after 24 hours without any inference requests. After scaling up, they remain active for 5 minutes before they can scale down again. During these cooldown periods, you will continue to be billed for the active resources.


This is true outside of serverless as well. All environments will now wait 24 hours before scaling to zero: elastic/elasticsearch#128914

Outside of serverless, this can be changed with the `xpack.ml.trained_models.adaptive_allocations.scale_to_zero_time` setting, down to a minimum of one minute.
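The timing rules discussed in this thread can be sketched in a few lines of Python. This is only an illustrative model, not Elasticsearch code: the function and constant names are hypothetical, and the values (24 hours idle before scaling to zero, 5 minutes active after a scale-up, a one-minute lower bound for the configurable idle time outside of serverless) are taken from the note and review comment above.

```python
from datetime import datetime, timedelta

# Values quoted in the PR discussion; all names here are hypothetical.
SCALE_TO_ZERO_TIME = timedelta(hours=24)       # idle time before scaling to zero
SCALE_UP_COOLDOWN = timedelta(minutes=5)       # time a deployment stays active after scaling up
MIN_SCALE_TO_ZERO_TIME = timedelta(minutes=1)  # lower bound mentioned for non-serverless


def validate_scale_to_zero_time(value: timedelta) -> timedelta:
    """Reject idle times below the one-minute minimum."""
    if value < MIN_SCALE_TO_ZERO_TIME:
        raise ValueError("scale_to_zero_time must be at least one minute")
    return value


def can_scale_to_zero(now: datetime, last_request: datetime,
                      scale_to_zero_time: timedelta = SCALE_TO_ZERO_TIME) -> bool:
    """True once no inference request has arrived for the full idle window."""
    return now - last_request >= validate_scale_to_zero_time(scale_to_zero_time)


def can_scale_down(now: datetime, last_scale_up: datetime,
                   cooldown: timedelta = SCALE_UP_COOLDOWN) -> bool:
    """True once the post-scale-up cooldown has elapsed."""
    return now - last_scale_up >= cooldown
```

For example, a deployment whose last request arrived 23 hours ago would not yet be eligible to scale to zero under this model, and billing for the active resources would continue during that window.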

kosabogi · Aug 15, 2025

Hi @prwhelan, thanks a lot for your feedback! I've modified my PR based on it, along with a few other smaller changes:

Trained model autoscaling: I moved the cooldown period information into its own heading. This makes it easier to highlight and also allows other pages to link directly to this specific section.

Autoscaling: I felt that going into the details of cooldown periods here would be out of scope and make the page a bit overwhelming. Instead, I added a more concise sentence that links to the new Cooldown periods section on the Trained model autoscaling page.

Elasticsearch billing dimensions: Realizing that this page is only applicable to Serverless, I updated the description for the Machine learning trained model autoscaling bullet point to reflect the new autoscaling behavior in Serverless.

Please let me know if you think these changes are appropriate or if you’d like me to adjust anything.
Thanks again!

prwhelan

Hey, sorry for the last-minute change. We are lowering the default value from 24 hours to 4 hours, and we are adding a maximum value of 72 hours:
elastic/elasticsearch#133355

Please ignore :)

We will eventually reduce this to 4 hours, but I will update the documentation when that time comes.

Adds information about cool down periods for Trained models autoscaling

2e1c212

kosabogi requested a review from ppf2 August 11, 2025 12:15

kosabogi requested a review from a team as a code owner August 11, 2025 12:15

kosabogi added the documentation Improvements or additions to documentation label Aug 11, 2025

kilfoyle approved these changes Aug 11, 2025

shainaraskas reviewed Aug 11, 2025

deploy-manage/cloud-organization/billing/elasticsearch-billing-dimensions.md

kosabogi and others added 2 commits August 14, 2025 07:27

Merge branch 'main' into cooldown-periods

89317a2

Removes applies_to tags

a3e34a4

ppf2 requested a review from prwhelan August 14, 2025 15:05

prwhelan approved these changes Aug 14, 2025

prwhelan reviewed Aug 14, 2025

kosabogi and others added 2 commits August 15, 2025 11:34

Applies suggestions

6886086

Merge branch 'main' into cooldown-periods

5c08f8f

prwhelan reviewed Aug 21, 2025

kosabogi added 2 commits August 25, 2025 07:34

Merge branch 'main' into cooldown-periods

42cc91e

Merge branch 'main' into cooldown-periods

f5035a7

kosabogi merged commit 1eb144c into main Aug 25, 2025
7 checks passed

kosabogi deleted the cooldown-periods branch August 25, 2025 12:13
